

Search for: All records

Creators/Authors contains: "Wu, Fan"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).


  1. Free, publicly-accessible full text available July 19, 2026
  2. Large language models (LLMs) have become popular tools as their ability to tackle a wide range of language-based tasks has advanced significantly. However, LLM applications are highly vulnerable to prompt injection attacks, which poses a critical problem. These attacks target LLM applications with carefully designed input prompts that divert the model from its original instructions, so that it may execute unintended actions. Such manipulations pose serious security threats that can result in data leaks, biased outputs, or harmful responses. This project explores the security vulnerabilities related to prompt injection attacks. To detect whether a prompt contains an injection attack, we follow two approaches: 1) a pre-trained LLM, and 2) a fine-tuned LLM. We then conduct a thorough analysis and comparison of their classification performance. First, we use the pre-trained XLM-RoBERTa model to detect prompt injections on the test dataset without any fine-tuning and evaluate it by zero-shot classification. Next, we apply supervised fine-tuning to this pre-trained LLM using a task-specific labeled dataset from deepset on Hugging Face; the fine-tuned model achieves 99.13% accuracy, 100% precision, 98.33% recall, and 99.15% F1-score in our experiments. We observe that our approach is highly efficient in detecting prompt injection attacks. 
    Free, publicly-accessible full text available July 8, 2026
  3. Free, publicly-accessible full text available April 27, 2026
  4. In today’s fast-paced software development environments, DevOps has revolutionized the way teams build, test, and deploy applications by emphasizing automation, collaboration, and continuous integration/continuous delivery (CI/CD). However, with these advancements comes an increased need to address security proactively, giving rise to the DevSecOps movement, which integrates security practices into every phase of the software development lifecycle. DevOps security remains underrepresented in academic curricula despite its growing importance in the industry. To address this gap, this paper presents a hands-on learning module that combines Chaos Engineering and White-box Fuzzing to teach core principles of secure DevOps practices in an authentic, scenario-driven environment. Chaos Engineering allows students to intentionally disrupt systems to observe and understand their resilience, while White-box Fuzzing enables systematic exploration of internal code paths to discover corner-case vulnerabilities that typical tests might miss. The module was deployed across three academic institutions, and both pre- and post-surveys were conducted to evaluate its impact. Pre-survey data revealed that while most students had prior experience in software engineering and cybersecurity, the majority lacked exposure to DevOps security concepts. Post-survey responses, gathered through ten structured questions, showed highly positive feedback: 66.7% of students strongly agreed, and 22.2% agreed, that the hands-on labs improved their understanding of secure DevOps practices. Participants also reported increased confidence in secure coding, vulnerability detection, and resilient infrastructure design. These findings support the integration of experiential learning techniques like chaos simulations and white-box fuzzing into security education. By aligning academic training with real-world industry needs, this module effectively prepares students for the complex challenges of modern software development and operations. 
    Free, publicly-accessible full text available July 8, 2026
  5. Although software developers of mHealth apps are responsible for protecting patient data and adhering to strict privacy and security requirements, many of them lack awareness of HIPAA regulations and struggle to distinguish between categories of HIPAA rules. Therefore, providing guidance on classifying HIPAA rule patterns is essential for developing secure applications for the Google Play Store. In this work, we identified the limitations of traditional Word2Vec embeddings in processing code patterns. To address them, we adopt multilingual BERT (Bidirectional Encoder Representations from Transformers), which provides contextualized embeddings for the attributes of the dataset. We applied this BERT model to our dataset to embed code patterns and then supplied the embedded code to various machine learning approaches. Our results demonstrate that the models significantly enhance classification performance, with Logistic Regression achieving an accuracy of 99.95%. Additionally, we obtained high accuracy from Support Vector Machine (99.79%), Random Forest (99.73%), and Naive Bayes (95.93%), outperforming existing approaches. This work underscores the effectiveness of the approach and showcases its potential for secure application development. 
    Free, publicly-accessible full text available July 8, 2026
  6. Changes to the storm-scale physical processes of an eastern United States mesoscale convective system (MCS) on 14 May 2018 in response to global warming are quantified using the pseudo–global warming (PGW) numerical method. Climate perturbations in temperature ΔT and specific humidity ΔQ of different magnitudes are imposed separately and simultaneously. The mid-twenty-first century environment becomes increasingly unstable with larger ΔT, promoting more favorable MCS conditions. By the late twenty-first century, however, this warming, which maximizes in the mid-troposphere, results in increased convective inhibition (CIN) and decreased convective available potential energy (CAPE). Midlevel warming also reduces cold pool generation through the downward advection of the relatively warm midlevel air. Consequently, the MCS of interest is weak in the midcentury and propagates discretely over the Appalachian Mountains, while it fails to initiate in the late century. In contrast, projected increases in ΔQ support more intense MCSs in both the mid- and late twenty-first centuries. Moisture increases are maximized in the lower troposphere, increasing CAPE and decreasing CIN. Additionally, the stronger convection generates deeper and denser cold pools. Therefore, storms remain robust as they move over the Appalachian Mountains. However, leeside isolated convective cells, which form due to lee waves in the more unstable environment, and their widespread cold pools reduce the leeside instability. This, in conjunction with the more intense MCS cold pools, leads to rapid MCS weakening in the lee. Experiments with both ΔT and ΔQ illustrate that larger magnitude increases in one thermodynamic variable may supersede increases in the other. 
    Free, publicly-accessible full text available July 15, 2026
  7. This study employs a pseudo–global warming approach to investigate precipitation changes from a mesoscale convective system (MCS) on 14 May 2018 over the eastern United States. An Appalachian-Mountain-crossing MCS is simulated for historical, mid-twenty-first century (2045–54), and late-twenty-first century (2090–99) climate scenarios. For experiments using ensemble-mean perturbations in atmospheric, soil, and oceanic variables derived from 34 general circulation models, MCS precipitation diminishes by 25% in the midcentury and 65% in the late century. Experiments testing the sensitivity to these variables separately reveal that atmospheric variables primarily drive precipitation changes. Additional sensitivity experiments quantify MCS responses to temperature, moisture, and wind perturbations separately, with the magnitude of perturbations stratified as low, moderate, or high. Experiments highlight the dominant though contrasting roles of the thermodynamic variables. In midcentury, temperature increases lead to reductions in rainfall rates by up to 74.3%, while increased moisture raises rainfall rates by 75.1%. In the late century, the MCS fails to initiate for temperature perturbations of all magnitudes. Rainfall rate and precipitation area substantially increase with larger moisture perturbations, while the frequency of heavy (95th percentile) and extreme (99th percentile) precipitation increases more than 100%, with minimal changes in precipitation rate. Finally, ensemble-mean perturbations are added to all variables, except for temperature or moisture, to which either a low or high perturbation is added. MCSs are robust when low-temperature or high-moisture perturbations are included, though they fail to initiate for low-moisture and high-temperature perturbations, highlighting the challenges in projecting future MCS behavior. 
    Free, publicly-accessible full text available July 15, 2026
  8. High-bandwidth applications, from multi-gigabit communication and high-performance computing to radar signal processing, demand ever-increasing processing speeds. However, they face limitations in signal sampling and computation due to hardware and power constraints. In the microwave regime, where operating frequencies exceed the fastest clock rates, direct sampling becomes difficult, prompting interest in neuromorphic analog computing systems. We present the first demonstration of direct broadband frequency domain computing using an integrated circuit that replaces traditional analog and digital interfaces. This features a Microwave Neural Network (MNN) that operates on signals spanning tens of gigahertz, yet is reprogrammed with slow, 150 MBit/sec control bitstreams. By leveraging significant nonlinearity in coupled microwave oscillators, features learned from a wide bandwidth are encoded in a comb-like spectrum spanning only a few gigahertz, enabling easy inference. We find that the MNN can search for bit sequences in arbitrary, ultra-broadband 10 GBit/sec digital data, demonstrating suitability for high-speed wireline communication. Notably, it can emulate high-level digital functions without custom on-chip circuits, potentially replacing power-hungry sequential logic architectures. Its ability to track frequency changes over long capture times also allows for determining flight trajectories from radar returns. Furthermore, it serves as an accelerator for radio-frequency machine learning, capable of accurately classifying various encoding schemes used in wireless communication. The MNN achieves true, reconfigurable broadband computation, which has not yet been demonstrated by classical analog modalities, quantum reservoir computers using superconducting circuits, or photonic tensor cores, and avoids the inefficiencies of electro-optic transduction. Its sub-wavelength footprint in a Complementary Metal-Oxide-Semiconductor process and sub-200 milliwatt power consumption enable seamless integration as a general-purpose analog neural processor in microwave and digital signal processing chips. 
  9. This research-to-practice paper reports students' perceptions of using a teaching framework called authentic learning to learn about information flow analysis. Using information flow analysis, practitioners trace the flow of data across one or multiple programs. Information flow analysis is helpful for multiple software engineering activities, such as detecting software bugs and developing software fuzzing techniques. Despite its practical value, information flow analysis remains difficult for students to learn, which in turn prevents them from reaping its benefits. Therefore, an application of a teaching framework can aid students in learning about information flow analysis. To that end, we systematically investigate whether authentic learning (a teaching framework that emphasizes providing hands-on experience for a practically relevant topic) is helpful for students to learn about information flow analysis. After completing the exercise, students are asked to participate in a survey where they report their perceptions of the exercise. We analyze data from 170 students who were introduced to information flow analysis through an authentic learning-based exercise. From our analysis, we observe that: (i) the majority of the students had little to no knowledge of information flow analysis prior to the authentic learning-based exercise; (ii) 74.1% of the 170 students find the authentic learning-based exercise helpful for learning about information flow analysis; and (iii) student perceptions vary across the three components of the authentic learning-based exercise. We conclude our paper by describing the implications of our findings for instructors and researchers. For example, instructors should consider the education level of students while designing activities for individual authentic learning components to educate students on information flow analysis. Furthermore, researchers can use empirical studies to devise strategies for how instructors can allocate their efforts across the authentic learning components. These studies may investigate the correlation between reported helpfulness and socio-technical factors, such as the education level of students. 
  10. This research paper systematically identifies students' perceptions of learning machine learning (ML) topics. To keep up with the ever-increasing need for professionals with ML expertise, for-profit and non-profit organizations conduct a wide range of ML-related courses at undergraduate and graduate levels. Despite the availability of ML-related education materials, there is a lack of understanding of how students perceive ML-related topics and the dissemination of ML-related topics. A systematic categorization of students' perceptions of these courses can aid educators in understanding the challenges that students face, and in using that understanding for better dissemination of ML-related topics in courses. The goal of this paper is to help educators teach ML topics by providing an experience report of students' perceptions related to learning ML. We accomplish our research goal by conducting an empirical study in which we deploy a survey with 83 students across five academic institutions. These students are recruited from a mixture of undergraduate and graduate courses. We apply a qualitative analysis technique called open coding to identify challenges that students encounter while studying ML-related topics. Using the same qualitative analysis technique, we identify the quality aspects that students prioritize for ML-related topics. From our survey, we identify 11 challenges that students face when learning about ML topics, amongst which data quality is the most frequent, followed by hardware-related challenges. We observe that the majority of the students prefer hands-on projects over theoretical lectures. Furthermore, we find that the surveyed students consider ethics, security, privacy, correctness, and performance essential considerations while developing ML-based systems. Based on our findings, we recommend that educators who teach ML-related courses (i) incorporate hands-on projects to teach ML-related topics, (ii) dedicate course materials to data quality, (iii) use lightweight virtualization tools to showcase computationally intensive topics, such as deep neural networks, and (iv) empirically evaluate how large language models can be used in ML-related education. 
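
As a quick arithmetic check on the metrics quoted in result 2 (100% precision, 98.33% recall, 99.15% F1-score): the F1-score is the harmonic mean of precision and recall. The sketch below is illustrative only and is not taken from the paper; the function name `f1_from_precision_recall` and the variable names are assumptions introduced here.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1-score, i.e. the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # common convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract of result 2, expressed as fractions.
reported_precision = 1.0
reported_recall = 0.9833
reported_f1 = 0.9915

computed_f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.4f}, reported F1 = {reported_f1:.4f}")
```

With these inputs the computed value is about 0.9916, which agrees with the reported 99.15% to within roughly 0.01 percentage point, a gap consistent with rounding or truncation of the published figures.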